
Conversation

@danielhanchen

Beware - my C is very rusty (haven't done C in like ages lol) - I might have transcribed it incorrectly from https://github.com/unslothai/unsloth/blob/main/unsloth/models/llama.py#L1116

From https://news.ycombinator.com/item?id=41053201
Llama 3.1 uses a new RoPE scaling mechanism for 128K context extension:

import math
import torch

# From https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/api/model.py#L41
def apply_scaling(freqs: torch.Tensor):
    # Values obtained from grid search
    scale_factor = 8
    low_freq_factor = 1
    high_freq_factor = 4
    old_context_len = 8192  # original llama3 length

    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    new_freqs = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # high-frequency band: keep the frequency unchanged
            new_freqs.append(freq)
        elif wavelen > low_freq_wavelen:
            # low-frequency band: divide by the full scale factor
            new_freqs.append(freq / scale_factor)
        else:
            # transition band: interpolate between scaled and unscaled
            assert low_freq_wavelen != high_freq_wavelen
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            new_freqs.append((1 - smooth) * freq / scale_factor + smooth * freq)
    return torch.tensor(new_freqs, dtype=freqs.dtype, device=freqs.device)

I did not add a flag to enable Llama 3.1 scaling, though.
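
For comparison, here is a rough plain-C sketch of the same per-frequency scaling that the Python above computes, in case it helps sanity-check the transcription. The function name and the in-place float-array interface are illustrative assumptions, not llama.cpp's actual internals:

#include <math.h>
#include <stddef.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

// Illustrative sketch only - mirrors apply_scaling() above, scaling freqs in place.
static void llama31_rope_apply_scaling(float * freqs, size_t n_freqs) {
    // Values obtained from Meta's grid search
    const float scale_factor     = 8.0f;
    const float low_freq_factor  = 1.0f;
    const float high_freq_factor = 4.0f;
    const float old_context_len  = 8192.0f;  // original llama3 length

    const float low_freq_wavelen  = old_context_len / low_freq_factor;   // 8192
    const float high_freq_wavelen = old_context_len / high_freq_factor;  // 2048

    for (size_t i = 0; i < n_freqs; i++) {
        const float wavelen = 2.0f * (float) M_PI / freqs[i];
        if (wavelen < high_freq_wavelen) {
            // high-frequency band: keep the frequency unchanged
        } else if (wavelen > low_freq_wavelen) {
            // low-frequency band: divide by the full scale factor
            freqs[i] /= scale_factor;
        } else {
            // transition band: interpolate between scaled and unscaled
            const float smooth = (old_context_len / wavelen - low_freq_factor)
                               / (high_freq_factor - low_freq_factor);
            freqs[i] = (1.0f - smooth) * freqs[i] / scale_factor + smooth * freqs[i];
        }
    }
}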
